Training Paradigms for Correcting Errors in Grammar and Usage
Abstract
This paper proposes a novel approach to training classifiers that detect and correct grammar and usage errors in text: selectively introducing mistakes into the training data. When training a classifier, we would like the distribution of examples seen in training to be as similar as possible to the one seen in testing. In error correction problems, such as correcting mistakes made by second language learners, a system is generally trained on correct data, since annotating data for training is expensive. Error generation methods avoid expensive data annotation and create training data that resemble non-native data with errors. We apply error generation methods and train classifiers for detecting and correcting article errors in essays written by non-native English speakers; we show that training on data that contain errors yields higher accuracy than training on clean native data. We propose several training paradigms with error generation and show that each such paradigm is superior to training a classifier on native data. We also show that the most successful error generation methods are those that use knowledge about the article distribution and error patterns observed in non-native text.
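The idea of error generation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-way article label set, the `CONFUSION` table and its probabilities, and the function names `corrupt_article` and `generate_training_examples` are all assumptions made up for illustration. The sketch takes article slots from clean text and replaces the correct article with one sampled from an assumed confusion distribution, so that the classifier sees a possibly erroneous "source" article as a feature while the label remains the correct article.

```python
import random

# Hypothetical label set: "a" covers a/an, "" stands for the zero article.
ARTICLES = ("a", "the", "")

# Hypothetical confusion table: P(article the writer produces | correct article).
# The numbers are illustrative placeholders, not values reported in the paper.
CONFUSION = {
    "a":   {"a": 0.90, "the": 0.04, "": 0.06},
    "the": {"a": 0.02, "the": 0.90, "": 0.08},
    "":    {"a": 0.01, "the": 0.03, "": 0.96},
}


def corrupt_article(correct, rng):
    """Sample the article a simulated non-native writer might use for one slot."""
    dist = CONFUSION[correct]
    choices = list(dist)
    weights = [dist[c] for c in choices]
    return rng.choices(choices, weights=weights, k=1)[0]


def generate_training_examples(slots, seed=0):
    """Map (context, correct_article) pairs to (context, source_article, label).

    The source article may contain an injected error; the label is always
    the correct article taken from the clean text.
    """
    rng = random.Random(seed)
    return [(ctx, corrupt_article(gold, rng), gold) for ctx, gold in slots]


if __name__ == "__main__":
    clean_slots = [
        ("I bought ___ book yesterday", "a"),
        ("___ sun rises in the east", "the"),
        ("She likes ___ music", ""),
    ]
    for example in generate_training_examples(clean_slots, seed=42):
        print(example)
```

In this sketch the error rates live in a single table, so a variant that draws on the article distribution and error patterns of non-native text would only need to replace the placeholder probabilities with estimates from annotated learner data.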
Similar resources
An Evaluation of Adopting Language Model as the Checker of Preposition Usage
Many grammar checkers in rule-based approach do not handle errors that come from various usages, for example, the usages of prepositions. To study the behavior of prepositions, we introduce the language model into a grammar-checking task. A language model is trained from a large training corpus, which contains many short phrases. It can be used for detecting and correcting certain types of gramm...
On the Emergence of Scientific Grammar in Iran
Writing the grammar of a language is one of the most significant outputs of linguistic studies. In Iran, it is Avicenna (Ibn-e Sina) who is credited with the first such compilation of the Persian language. Understanding the weaknesses associated with the traditional trends of grammar writing in Iran, contemporary Iranian linguists adopted the modern Western approach following the Chomskyan Turn...
Scaffolding Moves by Learners in Online Interactions
Learners can collaborate with each other to achieve a lesson objective. In the collaboration, they can provide each other with guidance in order to identify mistakes and improve their achievements. With the rise of online instructions, this small-scale exploratory study aimed to see how proficient learners guided their less proficient classmates in correcting the grammatical accuracy of sentenc...
Robust Error Correction of Continuous Speech Recognition
We present a post-processing technique for correcting errors committed by an arbitrary continuous speech recognizer. The technique leverages our observation that consistent recognition errors arising from mismatched training and usage conditions can be modeled and corrected. We have implemented a post-processor called SPEECHPP to correct word-level errors, and we show that this post-processing te...